the research field. Facial emotions are conveyed through human facial expressions,
which play a crucial role in nonverbal communication. Machine learning
algorithms have been widely used in systems for detecting human facial emotions
from images. However, many systems suffer from a low accuracy rate. In this
paper, we present a system for facial emotion recognition from images. In
the proposed system, the samples of facial emotions are taken from the Yale
Face database. The histogram of oriented gradients (HOG) is used
to extract features from the images. The extracted features are fed to the fast
learning network (FLN) algorithm, which classifies the
images of facial emotions with respect to their subjects. Several evaluation
measures are used to assess the performance of the proposed system.
Based on the experimental results, the proposed system achieves a highest
accuracy of 95.04% and a precision of 72.73%. The recall, F-measure, and
G-mean of the proposed system are also all equal to 72.73%.
1. INTRODUCTION

The detection of human emotions is an active research area that has gained considerable significance recently. In
general, human facial emotions can be classified into positive and negative emotions. Positive
emotions include joy, love, and happiness. Negative emotions, on the other hand, include
anger, loneliness, disgust, fear, and rage [1]. Typically, a technique for detecting facial emotion
involves three main phases. The first phase concerns the raw image database that is
used for the detection of facial emotion. The second phase is the feature
extraction process, which extracts features from the images. Finally, the extracted features are fed into a
classifier in order to identify the user's emotion [2]. In society, humans interact with the help of facial
emotions, which are considered a universal language. Furthermore, these emotions transcend
cultural diversity and ethnicity.

The expressions of the human face convey information. They
reflect people's mental state, which is directly related either to the physical effort they must
apply in order to perform tasks or simply to their intentions. Consequently, automatic
emotion recognition techniques, combined with high-quality sensors, can be a useful tool in different fields such
as robotics, image processing, cyber security, and various applications of virtual reality [3]. Machine
learning (ML) and deep learning (DL) algorithms have been extensively applied because they have
proved their effectiveness and efficiency in classifying subjects [4]. Moreover, ML and DL
algorithms have improved the performance of many recent systems in different fields, such as image
classification in the medical domain [5]-[7], language identification [8], [9], fog-cloud networks [10],
identification of spam emails [11], speech emotion recognition [12], vehicle detection [13], and voice pathology
detection [14]-[16]. Additionally, DL and ML algorithms have played a main role in
methods of facial emotion recognition [17], [18]. The main purpose of using these algorithms is to train and
build a system that is capable of classifying subjects efficiently with high detection accuracy [19], [20]. The
main challenge faced by researchers in the domain of facial emotion detection is the shortage of
databases of spontaneous expressions. In other words, collecting and capturing people's spontaneous
expressions in images is considered the biggest challenge for researchers [21]. Recently,
many systems and techniques have been presented in the literature for the detection and recognition of
facial emotion using image databases.

Moghaddam et al. [22] presented a new deep network for facial emotion recognition from
images. In this method, a VGG16 convolutional neural network (CNN) is used to extract spatial
features. Subsequently, a bidirectional long short-term memory (Bi-LSTM) neural network
is applied in order to learn spatio-angular features. The extracted features are then fed
to a fusion scheme that classifies the facial emotions and produces the results of the system.
Four facial emotions were used in this method: neutral, angry, surprised, and happy. The image
samples were taken from the light field face database (LFFD), which includes 800 samples of facial
emotion images. The results showed that the highest accuracy, 94.00%, was
obtained for the happy subject. However, this method was evaluated in terms of accuracy only,
although there are other important measures such as sensitivity, specificity, and G-mean.

Another system was proposed in [23]. The authors presented an adversarial-attack-resistant
system for analysing and recognizing facial emotions through landmarks. The proposed
system outperformed the ResNet model in image classification in different cases after
the data had been subjected to an adversarial attack. The image samples were taken from the
Cohn-Kanade database, which contains 3,368 facial emotion samples. Seven facial emotions
are included in the database: anger, fear, sadness, disgust, neutral, surprise, and happiness. The
proposed method achieved 97.43% accuracy, while ResNet achieved 90% after the adversarial
attack.

Pandey et al. [24] proposed a system for improving the performance of facial emotion
recognition by using Laplacian and gradient images. In this system, the Laplacian and gradient images are
used as input data, along with the original input, to the CNN algorithm. These images
help the network to learn from additional information. Two well-known databases
of facial images were used in this system: Face expression recognition (FERplus) and
Karolinska directed emotional faces (KDEF). The KDEF database contains 4,900 images in total,
with an equal number of facial expressions for female and male subjects. It
includes seven classes of expressions: anger, fear, sadness, disgust, neutral, surprise, and
happiness. The FERplus database contains 35,000 images in total and eight classes,
including contempt. In the proposed system, 28,000 samples of facial emotion images were used
for the training phase, while the remaining samples were divided equally between the validation and test
phases. The experimental results showed that this system achieved 88.16% accuracy. However, the
system was evaluated in terms of accuracy only, and the obtained classification results are not yet
encouraging.

Fang [25] presented a method to address the image classification problem, specifically
the problem of misclassified cases in facial emotion recognition. In particular, the author
presented a backtracking algorithm for tracking down the activated pixels of the image between the last layers
of feature maps. Subsequently, the facial features responsible for the misclassifications are
visualized. In facial emotion recognition, the activations of the unique image pixels determine the
classification results. The samples of facial emotion were taken from the Radboud faces database (RaFD).
This database contains 67 male and female subjects of different ages, with 120 image samples
for each subject. The facial emotions used in this method are neutral, surprise, happiness, disgust,
contempt, anger, sadness, and fear. This method achieved 90.97% classification accuracy for the
recognition of human facial emotions.
2. METHOD

In this section, we present the methodology of the proposed system for facial emotion
recognition. The samples of the database are images of human faces showing different emotions, covering 11 subjects.
These image samples are first processed in the pre-processing step. Subsequently, the
histogram of oriented gradients (HOG) technique is applied as a feature extraction method in order to extract
the features of the facial emotion images. It is worth mentioning that the pre-processing and HOG processes together
constitute the entire feature extraction step. Finally, the extracted features are fed to the fast learning network (FLN)
algorithm to classify the images of human facial emotions. Figure 1 shows the general scheme
of the proposed method.
Figure 1. The general scheme of the proposed system


2.1. Yale faces images database 

In our proposed system, the samples of facial emotion images were taken from [26]. This
database is called Yale faces; it is a well-known database that has been widely used in the recognition
of facial emotions. The Yale face database consists of 11 subjects: centre light, glasses, happy, left
light, no glasses, normal, right light, sad, sleepy, surprised, and wink. Each subject contains 15 samples of
human facial emotion images. Consequently, the total number of images in the Yale face
database is 165. The original image samples of the Yale faces database were processed as 8-bit 3D images with a size
of 231x195x3. In our experiment, we used all the samples of human facial emotion images in the
Yale faces database. It is worth mentioning that all image samples were converted and resized (see
section 2.2 for the pre-processing step). In addition, we divided the entire database into 80% (132 samples)
for the training phase and the remaining 20% (33 samples) for the testing phase. In other words, the
training set of each class includes 12 samples, while the testing set of each class includes 3 samples.
Table 1 illustrates the Yale faces database used in the proposed system.
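The per-class split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text gives only the 12/3 counts per class, so the random selection, the fixed seed, and the helper name `stratified_split` are assumptions.

```python
import random

def stratified_split(labels, train_per_class, seed=0):
    """Return (train_idx, test_idx) with a fixed number of training samples per class."""
    rng = random.Random(seed)  # fixed seed: an assumption, for repeatability only
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)  # random choice of training samples (assumed)
        train_idx.extend(indices[:train_per_class])
        test_idx.extend(indices[train_per_class:])
    return train_idx, test_idx

# 11 classes with 15 samples each, labelled 1..11 as in Table 1.
labels = [c for c in range(1, 12) for _ in range(15)]
train_idx, test_idx = stratified_split(labels, train_per_class=12)
print(len(train_idx), len(test_idx))  # 132 33
```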


2.2. Pre-processing 

The pre-processing of the human facial emotion images in this study consists of two main
steps: conversion and image resizing.
In the conversion step, all the images of human facial emotions are read and their dimensionality is checked;
the 3D images are then converted into grayscale (2D). In the
resizing step, all human facial emotion images are resized to 150x150
pixels. Thereafter, the output of the pre-processing operation serves as input to the HOG technique in
order to extract the needed features from the facial emotion images.
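The two pre-processing steps can be sketched with NumPy as below. The text does not state the grayscale formula or the interpolation method, so the BT.601 luma weights and the nearest-neighbour resize used here are assumptions, as are the function names.

```python
import numpy as np

def to_grayscale(image):
    """Collapse an H x W x 3 array to H x W; 2D inputs are returned unchanged."""
    img = np.asarray(image, dtype=float)
    if img.ndim == 3:
        # ITU-R BT.601 luma weights (assumed; the text does not give the formula).
        img = img[..., :3] @ np.array([0.299, 0.587, 0.114])
    return img

def resize_nearest(gray, size=(150, 150)):
    """Nearest-neighbour resize of a 2D array to `size` (rows, cols)."""
    rows = np.arange(size[0]) * gray.shape[0] // size[0]
    cols = np.arange(size[1]) * gray.shape[1] // size[1]
    return gray[rows[:, None], cols]

def preprocess(image, size=(150, 150)):
    """Conversion step followed by the resize step."""
    return resize_nearest(to_grayscale(image), size)

sample = np.zeros((231, 195, 3), dtype=np.uint8)  # same shape as a raw Yale image
print(preprocess(sample).shape)  # (150, 150)
```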


Table 1. Details of the Yale faces database

Class name      Number of samples    Label
Centre Light    15                   1
Glasses         15                   2
Happy           15                   3
Left Light      15                   4
No Glasses      15                   5
Normal          15                   6
Right Light     15                   7
Sad             15                   8
Sleepy          15                   9
Surprised       15                   10
Wink            15                   11


2.3. Feature extraction: HOG technique 

The HOG technique relies on accumulating the gradient directions of the image pixels over a
certain region called a "cell". A one-dimensional histogram is then constructed for each cell, and the histograms provide
the feature vector that serves as input to the classification process. Suppose G denotes
the grayscale function used for analysing and describing an image. Every image
is split into a set of cells of NxN pixels. Figure 2(I) presents the splitting of the image into
a set of cells. The gradient orientation $\theta_{k,r}$ of each pixel $(k, r)$ is calculated as in (1). Figure
2(II and III) demonstrates the computation of the gradient orientation.

$$\theta_{k,r} = \tan^{-1}\left(\frac{G(k, r+1) - G(k, r-1)}{G(k+1, r) - G(k-1, r)}\right) \qquad (1)$$

Furthermore, the orientations $\theta_i^j$, $i = 1, \ldots, N^2$, of the same cell j are accumulated and quantized into an
M-bin histogram, as presented in Figure 2(IV and V). In the final step, all the obtained histograms are
ordered and concatenated into a HOG histogram, which is the final output of the feature extraction process, as
depicted in Figure 2(VI). Figure 2 provides an example with a cell size of 4 pixels and 8 orientation bins for the
cell histograms.
Figure 2. The HOG diagram
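The cell-histogram construction described in this section can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: orientations are folded into [0, pi), every pixel casts one unweighted vote (common HOG variants weight votes by gradient magnitude and normalize over blocks), border pixels get a zero gradient, and the defaults `cell=4` and `bins=8` are taken from the Figure 2 example, not from the reported experiments.

```python
import numpy as np

def hog_features(gray, cell=4, bins=8):
    """Concatenate per-cell orientation histograms of a 2D grayscale image."""
    g = np.asarray(gray, dtype=float)
    # Central differences as in (1); border pixels keep a zero gradient.
    num = np.zeros_like(g)
    den = np.zeros_like(g)
    num[:, 1:-1] = g[:, 2:] - g[:, :-2]   # G(k, r+1) - G(k, r-1)
    den[1:-1, :] = g[2:, :] - g[:-2, :]   # G(k+1, r) - G(k-1, r)
    # arctan2 avoids division by zero; angles are folded into [0, pi) (assumed).
    theta = np.mod(np.arctan2(num, den), np.pi)
    bin_idx = np.minimum((theta / np.pi * bins).astype(int), bins - 1)
    # Pixels beyond the last full cell are dropped.
    rows, cols = g.shape[0] // cell, g.shape[1] // cell
    feats = []
    for i in range(rows):
        for j in range(cols):
            block = bin_idx[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            feats.append(np.bincount(block.ravel(), minlength=bins))
    return np.concatenate(feats)

image = np.random.default_rng(0).random((150, 150))  # stand-in for a pre-processed face
print(hog_features(image).shape)  # (10952,)
```

With these settings a 150x150 image yields 37 x 37 full cells of 8 bins each, hence 10,952 features.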


2.4. Classification: FLN algorithm 

The fast learning network (FLN) is a novel double parallel forward artificial neural network proposed
in [27]. The FLN algorithm is based on the least squares method. The diagram of the FLN is presented in
Figure 3, followed by a detailed description of the FLN.


Figure 3. The FLN diagram (input layer, hidden layer, output layer)


Suppose there are N arbitrary distinct samples $\{x_i, y_i\}$, where:
- $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T \in R^n$ is the ith training sample, with vector dimension n; and
- $y_i = [y_{i1}, y_{i2}, \ldots, y_{il}]^T \in R^l$ is the ith target vector, with vector dimension l.
The FLN algorithm has m neurons in the hidden layer. $W^{in}$ is the matrix of input weights, with
dimension (m x n), which connects the neurons of the input and hidden layers, while $b = [b_1, b_2, \ldots, b_m]$ is the
vector of hidden neuron biases. $W^{oh}$ is a matrix of weights with dimension (l x m) which connects the
hidden and output layers. $W^{oi}$ is a matrix of weights with dimension (l x n) which connects the neurons of the
input and output layers. $c = [c_1, c_2, \ldots, c_l]^T$ is the vector of output neuron biases. The activation functions of the
output neurons and hidden neurons are f(·) and g(·), respectively. When the output neuron biases $c = [c_1, c_2,
\ldots, c_l]^T$ are set equal to zero, they can be ignored in the activation function of the output neurons. The
mathematical model of the FLN algorithm is shown in (2):


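$$
\begin{aligned}
y_{j1} &= f\left(\sum_{r=1}^{n} w^{oi}_{1r} x_{jr} + c_1 + \sum_{k=1}^{m} w^{oh}_{1k}\, g\left(b_k + \sum_{t=1}^{n} w^{in}_{kt} x_{jt}\right)\right) \\
y_{j2} &= f\left(\sum_{r=1}^{n} w^{oi}_{2r} x_{jr} + c_2 + \sum_{k=1}^{m} w^{oh}_{2k}\, g\left(b_k + \sum_{t=1}^{n} w^{in}_{kt} x_{jt}\right)\right) \\
&\;\;\vdots \\
y_{jl} &= f\left(\sum_{r=1}^{n} w^{oi}_{lr} x_{jr} + c_l + \sum_{k=1}^{m} w^{oh}_{lk}\, g\left(b_k + \sum_{t=1}^{n} w^{in}_{kt} x_{jt}\right)\right)
\end{aligned}
\qquad j = 1, 2, \ldots, N \qquad (2)
$$

Equivalently, it can be represented as shown in (3).

$$y_j = f\left(W^{oi} x_j + c + \sum_{k=1}^{m} W^{oh}_k\, g\left(W^{in}_k x_j + b_k\right)\right), \qquad j = 1, 2, \ldots, N \qquad (3)$$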


where $W^{oi}_r = [w^{oi}_{1r}, w^{oi}_{2r}, \ldots, w^{oi}_{lr}]^T$ is the vector of weights connecting the rth input neuron and
the output neurons, $W^{oh}_k = [w^{oh}_{1k}, w^{oh}_{2k}, \ldots, w^{oh}_{lk}]^T$ is the vector of weights connecting the kth
hidden neuron and the output neurons, and $W^{in}_k = [w^{in}_{k1}, w^{in}_{k2}, \ldots, w^{in}_{kn}]$ is the vector of weights
connecting the kth hidden neuron and the input neurons. The output matrix of the hidden layer neurons (G) is
calculated as in (4).


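$$G(W^{in}_1, \ldots, W^{in}_m, b_1, \ldots, b_m, x_1, \ldots, x_N) =
\begin{bmatrix}
g(W^{in}_1 x_1 + b_1) & \cdots & g(W^{in}_1 x_N + b_1) \\
\vdots & \ddots & \vdots \\
g(W^{in}_m x_1 + b_m) & \cdots & g(W^{in}_m x_N + b_m)
\end{bmatrix}_{m \times N} \qquad (4)$$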


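The matrix of the output weights $W = [W^{oi}\; W^{oh}]$ can be determined via the Moore-Penrose
generalized inverse, as shown in (5).

$$W = Y \begin{bmatrix} X \\ G \end{bmatrix}^{+} = Y H^{+}, \quad \text{where } H = \begin{bmatrix} X \\ G \end{bmatrix} \qquad (5)$$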


$W^{oi}$ and $W^{oh}$ are calculated as shown in (6).

$$\begin{aligned}
W^{oi} &= W(1:l,\ 1:n) \\
W^{oh} &= W(1:l,\ n+1:(n+m))
\end{aligned} \qquad (6)$$

Assume that a training set $\{(x_i, y_i) \mid x_i \in R^n, y_i \in R^l\}$ of N samples is given, g(·) is the activation function,
and m is the number of hidden layer neurons. The learning procedure of the FLN algorithm can then
be summarized in the following steps: i) generate the matrices of the input weights and biases ($W^{in}$ and b) randomly,
ii) calculate the output matrix of the hidden layer using (4), iii) compute the combination matrix (W) using
(5), and iv) identify the model parameters of the FLN algorithm using (6).
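Steps i) to iv) can be sketched with NumPy as below. This is a minimal illustration rather than the authors' MATLAB implementation. It assumes a linear output activation f with zero output biases c, a sigmoid for g, and random weights drawn uniformly from [-1, 1]; none of these choices is stated in the text, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fln_train(X, Y, m, rng):
    """X: (n, N) inputs, Y: (l, N) targets, m: number of hidden neurons."""
    n = X.shape[0]
    # Step i: random input weights and hidden biases (uniform range is assumed).
    W_in = rng.uniform(-1.0, 1.0, size=(m, n))
    b = rng.uniform(-1.0, 1.0, size=(m, 1))
    # Step ii: hidden-layer output matrix G of shape (m, N), as in (4).
    G = sigmoid(W_in @ X + b)
    # Step iii: W = Y H^+ with H = [X; G], as in (5).
    H = np.vstack([X, G])
    W = Y @ np.linalg.pinv(H)
    # Step iv: split W into W_oi (l, n) and W_oh (l, m), as in (6).
    return W_in, b, W[:, :n], W[:, n:]

def fln_predict(X, W_in, b, W_oi, W_oh):
    """Model (3) with linear f and c = 0 (both assumed)."""
    return W_oi @ X + W_oh @ sigmoid(W_in @ X + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))           # 20 toy samples with 5 features
labels = rng.integers(0, 3, size=20)   # 3 toy classes
Y = np.eye(3)[:, labels]               # one-hot targets, shape (3, 20)
params = fln_train(X, Y, m=30, rng=rng)
predicted = fln_predict(X, *params).argmax(axis=0)
```

Taking the arg-max over the output neurons turns the real-valued outputs into class labels; one-hot targets are a common choice for this, assumed here.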


3. RESULTS AND DISCUSSION 

The main objective of the proposed system is facial emotion detection based on human face images.
In this system, we used the Yale faces database as input data. The features of the image
samples were extracted using the HOG technique. Subsequently, the FLN algorithm was applied to
classify the facial emotion images according to their proper subjects. Eleven subjects were included in
this system: Centre Light (Cen. Lig.), Glasses (Gla.), Happy (Hap.), Left Light (Lef. Lig.), No Glasses
(No Gla.), Normal (Nor.), Right Light (Rig. Lig.), Sad, Sleepy (Sle.), Surprised (Sur.), and Wink (Win.). The
emotion database was divided into 80% for training and 20% for testing. In addition, the experiments
were run with different numbers of hidden neurons in the FLN algorithm:
they started with 100 hidden neurons and ended with
600, with an increment step of 50. Therefore, 11 experiments were conducted
in total. Several evaluation measures were used to evaluate the performance of
the proposed system in the detection of human facial emotions, namely accuracy, precision, recall (i.e.,
sensitivity), F-measure, and G-mean, as shown in (7)-(11) [28]. The experiments were
carried out using MATLAB.


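$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (7)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (8)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (9)$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Recall} + \text{Precision}} \qquad (10)$$

$$\text{G-mean} = \sqrt{\text{Specificity} \times \text{Recall}} \qquad (11)$$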


Here, TP and TN denote the true positives and the true negatives,
respectively, while FP refers to the false positives and FN to the false negatives. Table 2 shows the
results achieved by the proposed system in the detection of human facial emotions. Based on the experimental results,
the proposed system achieved its best results when the number of hidden neurons of the fast
learning network (FLN) algorithm was 500. The highest accuracy and precision
were 95.04% and 72.73%, respectively. Furthermore, the highest recall, F-measure, and
G-mean were all equal to 72.73%. In contrast, the proposed system achieved its
lowest results when the number of hidden neurons was 100. In this case, the accuracy was
91.74%, while the precision, recall, F-measure, and G-mean were all equal to
54.55%. In addition, the results of the proposed system for each class of facial emotion expression
are shown in Table 3. The proposed system obtained the highest results for the happy,
left light, and right light classes, for which the accuracy, precision, recall, F-measure, and G-mean
were all equal to 100.00%. In this regard, the proposed system has shown
encouraging results for the recognition of human facial emotions. Furthermore, the confusion matrix of the
proposed system is illustrated in Table 4.
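As a worked example, measures (7)-(10) can be recomputed from the confusion counts reported in Table 2 for 500 hidden neurons (TP = 24, TN = 321, FP = 9, FN = 9). The sketch below is illustrative only; the function name and the convention of returning 0 when a denominator is zero are assumptions.

```python
def evaluate(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F-measure as percentages, per (7)-(10)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0   # 0 on empty denominator (assumed)
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return tuple(round(100 * v, 2) for v in (accuracy, precision, recall, f_measure))

print(evaluate(tp=24, tn=321, fp=9, fn=9))  # (95.04, 72.73, 72.73, 72.73)
```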

In order to evaluate our proposed system against other methods, we compared its performance
with the methods in [29]-[33] in terms of the detection accuracy for distinguishing
human facial emotions from images. These methods proposed various machine learning
and deep learning techniques for facial emotion recognition using images as input data. All of these
methods used the Yale faces database for training and testing their systems. The results
show that the proposed system using the FLN algorithm outperformed all of these
methods in terms of accuracy. Table 5 shows the comparison between
the methods.


Table 2. The achieved results of the proposed system

Hidden neurons   TP   TN    FP   FN   Accuracy   Precision   Recall   F-measure   G-mean
100              18   315   15   15   91.74      54.55       54.55    54.55       54.55
150              23   320   10   10   94.49      69.70       69.70    69.70       69.70
200              22   319   11   11   93.94      66.67       66.67    66.67       66.67
250              23   320   10   10   94.49      69.70       69.70    69.70       69.70
300              21   318   12   12   93.39      63.64       63.64    63.64       63.64
350              20   317   13   13   92.84      60.61       60.61    60.61       60.61
400              22   319   11   11   93.94      66.67       66.67    66.67       66.67
450              21   318   12   12   93.39      63.64       63.64    63.64       63.64
500              24   321   9    9    95.04      72.73       72.73    72.73       72.73
550              22   319   11   11   93.94      66.67       66.67    66.67       66.67
600              21   318   12   12   93.39      63.64       63.64    63.64       63.64


Indonesian J Elec Eng & Comp Sci, Vol. 30, No. 3, June 2023: 1478-1487 




Table 3. The achieved results of the proposed system for each class

Emotion expression   TP   TN   FP   FN   Accuracy   Precision   Recall   F-measure   G-mean
Centre light         2    30   0    1    96.97      100.00      66.67    80.00       81.65
Glasses              3    26   4    0    87.88      42.86       100.00   60.00       65.47
Happy                3    30   0    0    100.00     100.00      100.00   100.00      100.00
Left light           3    30   0    0    100.00     100.00      100.00   100.00      100.00
No glasses           0    30   0    3    90.91      0           0        0           0
Normal               1    30   0    2    93.94      100.00      33.33    50.00       57.74
Right light          3    30   0    0    100.00     100.00      100.00   100.00      100.00
Sad                  2    29   1    1    93.94      66.67       66.67    66.67       66.67
Sleepy               2    30   0    1    96.97      100.00      66.67    80.00       81.65
Surprised            2    30   0    1    96.97      100.00      66.67    80.00       81.65
Wink                 3    26   4    0    87.88      42.86       100.00   60.00       65.47
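As a consistency check, the per-class counts of Table 3 can be summed column by column. The sketch below takes TP = 3 for the right light class, the only value consistent with three test images per class and the reported 100% recall. Under that reading, the totals coincide with the 500-neuron row of Table 2, which suggests that the overall figures are obtained by pooling the per-class counts; this pooling is an inference from the numbers, not something the text states.

```python
# Per-class (TP, TN, FP, FN) counts transcribed from Table 3.
per_class = {
    "Centre light": (2, 30, 0, 1),
    "Glasses": (3, 26, 4, 0),
    "Happy": (3, 30, 0, 0),
    "Left light": (3, 30, 0, 0),
    "No glasses": (0, 30, 0, 3),
    "Normal": (1, 30, 0, 2),
    "Right light": (3, 30, 0, 0),  # TP read as 3 (three test images per class)
    "Sad": (2, 29, 1, 1),
    "Sleepy": (2, 30, 0, 1),
    "Surprised": (2, 30, 0, 1),
    "Wink": (3, 26, 4, 0),
}

# Column-wise sums give the pooled TP, TN, FP and FN.
totals = tuple(sum(column) for column in zip(*per_class.values()))
print(totals)  # (24, 321, 9, 9)
```

Because every test image has exactly one true class and one predicted class, each misclassification adds one false positive and one false negative, so the pooled FP and FN are equal and the pooled precision and recall coincide.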


Table 4. The confusion matrix of the proposed system 
Table 5. The comparison between methods


Methods Accuracy 
Our Method using FLN algorithm 95.04% 
Li et al. [29] 90.3% 
Cao et al. [30] 74.55% 
Aly [31] 91.7% 
Dandpat and Meher [32] 92.8% 
Zhou et al. [33] 93.33% 


4. CONCLUSION 

Machine learning algorithms have played a vital role in systems of facial emotion recognition from
images, and they are considered the main part of such systems. However, there is a need to
investigate other machine learning algorithms for the recognition of facial emotions. This paper has presented
a system for facial emotion recognition from images. The samples of facial images were taken from the Yale face
database. In the proposed system, 11 expressions were used for human facial emotions:
centre light, glasses, happy, left light, no glasses, normal, right light, sad, sleepy, surprised, and wink.
The HOG technique was used as the feature extraction method to extract the needed features from the facial
images, while the FLN algorithm was used as a classifier to identify the images of the facial emotions with
respect to their subjects. Based on the experiments, the highest accuracy of the proposed system
reached 95.04%, and the highest precision, recall, F-measure, and G-mean
were all equal to 72.73%. The proposed system has shown promising results
in the recognition of facial emotions from images. In future work, the FLN algorithm can be applied to a different
database of facial emotions.